The estimated mean of a Gaussian distribution with varying sampling sizes.
A breast cancer data set with 569 samples [Wolberg, et al., 1994;
et al., 1995] was also used to demonstrate how the random sampling
approach can help reach the real data proportion. Among these
samples, 212 were malignant tumours. The malignancy ratio was
0.373, i.e., 37.3% of the samples in this data set were malignant
tumours. Many times, half of the samples were drawn from this data set
randomly. The malignancy ratio was calculated within the drawn sample
for varying sampling times from ten to 1,000. For instance, for K
sampling times, K samples were drawn and K malignancy ratios were
calculated, one for each drawn sample. Afterwards, the mean value of
the K malignancy ratios was recorded. It was expected that when the
number of random sampling times increased, the estimated malignancy
ratio within the drawn samples would approach the real ratio, i.e.,
0.373. Figure 3.14 shows this simulation. From this plot, it can be
seen that when the sampling times increased, the estimated malignancy
ratio among randomly drawn samples was indeed approaching the real
ratio.
The malignancy ratio within sampled data with varying sampling times for the
breast cancer data set.
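
The repeated half-sampling experiment described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original simulation code: the labels are rebuilt from the two counts given in the text (569 samples, 212 malignant) rather than loaded from the data set itself, and the function name, the fixed seed, rounding "half" down to 284, and drawing without replacement are all assumptions.

```python
import random

N_SAMPLES = 569      # total number of samples (from the text)
N_MALIGNANT = 212    # number of malignant tumours (from the text)
TRUE_RATIO = N_MALIGNANT / N_SAMPLES  # about 0.373


def mean_malignancy_ratio(k, seed=0):
    """Draw half of the samples k times and return the mean of the k ratios.

    Hypothetical helper: the seed and the sampling scheme are assumptions.
    """
    rng = random.Random(seed)
    # Synthetic labels rebuilt from the counts: 1 = malignant, 0 = benign.
    labels = [1] * N_MALIGNANT + [0] * (N_SAMPLES - N_MALIGNANT)
    half = N_SAMPLES // 2  # assumption: "half" is rounded down to 284
    ratios = []
    for _ in range(k):
        drawn = rng.sample(labels, half)   # random draw without replacement
        ratios.append(sum(drawn) / half)   # malignancy ratio of this draw
    return sum(ratios) / k


if __name__ == "__main__":
    # Sampling times from ten to 1,000, as in the text.
    for k in (10, 100, 1000):
        estimate = mean_malignancy_ratio(k)
        error = abs(estimate - TRUE_RATIO)
        print(f"K={k:5d}  mean ratio={estimate:.4f}  abs error={error:.4f}")
```

Under these assumptions, the mean of the K ratios should settle closer to 0.373 as K grows, although any single run with a small K may land further away by chance.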
As discussed above, it can be seen that the random sampling approach
needed many repeats to reach a reasonable approximation of the real value of